Game-theoretical control with continuous action sets
Motivated by the recent applications of game-theoretical learning techniques
to the design of distributed control systems, we study a class of control
problems that can be formulated as potential games with continuous action sets,
and we propose an actor-critic reinforcement learning algorithm that provably
converges to equilibrium in this class of problems. The method employed is to
analyse the learning process under study through a mean-field dynamical system
that evolves in an infinite-dimensional function space (the space of
probability distributions over the players' continuous controls). To do so, we
extend the theory of finite-dimensional two-timescale stochastic approximation
to an infinite-dimensional, Banach space setting, and we prove that the
continuous dynamics of the process converge to equilibrium in the case of
potential games. These results combine to give a provably-convergent learning
algorithm in which players do not need to keep track of the controls selected
by the other agents.
Comment: 19 pages
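As a concrete (and heavily simplified) illustration of the scheme described in this abstract: the sketch below replaces the measure-valued mixed strategies with a parametric Gaussian per player, in a hypothetical identical-interest (hence potential) game on [0, 1]. The potential function, step-size schedules, and noise width are all assumptions for the sketch, not the paper's construction; what it does share with the abstract is the two-timescale structure (fast critic, slow actor) and the fact that each player uses only its own sampled action and observed payoff.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-player potential game on [0, 1]: both players receive the
# potential itself as payoff (an identical-interest game is the simplest
# potential game). Phi is strictly concave with maximiser (0.6, 0.4).
def potential(a):
    return -(a[0] - 0.7) ** 2 - (a[1] - 0.3) ** 2 - 0.5 * (a[0] - a[1]) ** 2

m = np.array([0.2, 0.9])   # actor: mean of each player's Gaussian mixed strategy
b = potential(m)           # critic: running baseline of observed payoffs
sigma = 0.1                # fixed exploration width (a simplification; the paper
                           # adapts full mixed strategies, not a Gaussian mean)

for n in range(1, 20001):
    a = m + sigma * rng.standard_normal(2)   # sample continuous actions
    u = potential(a)                         # observed common payoff
    alpha = 0.5 / (n + 50) ** 0.6            # fast (critic) step size
    beta = 0.5 / (n + 50) ** 0.9             # slow (actor) step size
    b += alpha * (u - b)                     # critic tracks the average payoff
    # Score-function (likelihood-ratio) update: each player nudges its mean
    # along a stochastic gradient of its own expected payoff.
    m += beta * (u - b) * (a - m) / sigma ** 2
    m = np.clip(m, 0.0, 1.0)                 # stay inside the action set
```

In this toy game the potential maximiser (0.6, 0.4) is the Nash equilibrium, and the strategy means drift toward it without either player observing the other's action.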
A phase transition for measure-valued SIR epidemic processes
We consider measure-valued processes $X=(X_t)$ that solve the following
martingale problem: for a given initial measure $X_0$, and for all smooth,
compactly supported test functions $\varphi$, \begin{eqnarray*}X_t(\varphi
)=X_0(\varphi)+\frac{1}{2}\int _0^tX_s(\Delta \varphi )\,ds+\theta
\int_0^tX_s(\varphi )\,ds\\{}-\int_0^tX_s(L_s\varphi )\,ds+M_t(\varphi
).\end{eqnarray*} Here $L_t$ is the local time density process associated
with $X$, and $M_t(\varphi)$ is a martingale with quadratic variation
$\int_0^tX_s(\varphi^2)\,ds$. Such processes arise as scaling
limits of SIR epidemic models. We show that there exist critical values
$\theta_c(d)\in(0,\infty)$ for dimensions $d=2,3$ such that if
$\theta>\theta_c(d)$, then the solution survives forever with positive
probability, but if $\theta<\theta_c(d)$, then the solution dies out in finite
time with probability 1. For $d=1$ we prove that the solution dies out almost
surely for all values of $\theta$. We also show that in dimensions $d=2,3$ the
process dies out locally almost surely for any value of $\theta$; that is, for
any compact set $K$, the process eventually vanishes on $K$.
Comment: Published at http://dx.doi.org/10.1214/13-AOP846 in the Annals of
Probability (http://www.imstat.org/aop/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Supersymmetric Ward Identities and NMHV Amplitudes involving Gluinos
We show how Supersymmetric Ward identities can be used to obtain amplitudes
involving gluinos or adjoint scalars from purely gluonic amplitudes. We obtain
results for all one-loop six-point NMHV amplitudes in $\mathcal{N}=4$ super
Yang-Mills theory which involve two gluinos or two scalar particles. More
general cases are also discussed.
Comment: 32 pages, minor typos fixed; one reference added
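The mechanics can be sketched in a line. In one common spinor-helicity convention (signs and overall factors are convention dependent, so this is schematic), a supersymmetric Ward identity reduces an MHV amplitude with a gluino pair to the Parke-Taylor gluon amplitude, \begin{eqnarray*}A_n\big(\Lambda_1^-,g_2^-,g_3^+,\dots ,\Lambda_k^+,\dots ,g_n^+\big)=\frac{\langle 2\,k\rangle}{\langle 2\,1\rangle}\,A_n\big(g_1^-,g_2^-,g_3^+,\dots ,g_n^+\big),\end{eqnarray*} and identities of this type, applied at one loop, are what fix the six-point NMHV gluino and scalar amplitudes in terms of purely gluonic ones.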
Staff Perceptions of Standards-Based Grading Prior To Implementation
The purpose of this qualitative study was to evaluate the perceptions of a group of middle school teachers regarding changing to standards-based grading (SBG). Data were collected from the transcripts of two focus groups and analyzed. Results indicated that SBG measures were not well known by all staff, and several clear resistance points were present, centered on five key themes: fear of loss of rigor, community pushback, lack of knowledge of SBG practices, lack of supporting infrastructure, and the extra time and work required. The recommendations that flow from these results are that, prior to implementing SBG, comprehensive data be collected on staff beliefs about grading and reporting in general, and that targeted, differentiated professional development be planned for staff based on those data. Continuing to expand SBG practices within schools is the ultimate goal, given the large body of research espousing its benefits.
Asynchronous Stochastic Approximation with Differential Inclusions
The asymptotic pseudo-trajectory approach to stochastic approximation of
Benaïm, Hofbauer and Sorin is extended for asynchronous stochastic
approximations with a set-valued mean field. The asynchronicity of the process
is incorporated into the mean field to produce convergence results which remain
similar to those of an equivalent synchronous process. In addition, this allows
many of the restrictive assumptions previously associated with asynchronous
stochastic approximation to be removed. The framework is extended for a coupled
asynchronous stochastic approximation process with set-valued mean fields.
Two-timescale arguments are used here in a manner similar to the original work
in this area by Borkar. The applicability of this approach is demonstrated
through learning in a Markov decision process.
Comment: 41 pages
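A canonical concrete instance of asynchronous stochastic approximation is tabular Q-learning, in which only the currently visited state-action entry is updated at each step. The sketch below uses a hypothetical two-state MDP (not taken from the paper); because its rewards and transitions are deterministic, a constant step size suffices for the sketch, whereas the asynchronous setting analysed in the paper requires decaying step sizes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical deterministic 2-state MDP: in either state, action 0 ("stay")
# pays 0 and keeps the state; action 1 ("move") pays 1 and switches state.
# With discount gamma = 0.9 the optimal policy always moves, so
# Q*(s, move) = 1/(1 - gamma) = 10 and Q*(s, stay) = gamma * 10 = 9.
gamma = 0.9
Q = np.zeros((2, 2))
s = 0
for _ in range(20000):
    a = int(rng.integers(2))             # random exploratory action
    r = float(a)                         # reward 1 only for "move"
    s_next = 1 - s if a == 1 else s
    # Asynchronous update: only the visited (s, a) entry moves this step.
    Q[s, a] += 0.5 * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next
```

Despite the updates arriving asynchronously along a single trajectory, every entry is visited infinitely often under the random exploration, and the table approaches Q* with stay-values 9 and move-values 10 in both states.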
Best-response Dynamics in Zero-sum Stochastic Games
We define and analyse three learning dynamics for two-player zero-sum discounted-payoff stochastic games. A continuous-time best-response dynamic in mixed strategies is proved to converge to the set of Nash equilibrium stationary strategies. Extending this, we introduce a fictitious-play-like process in a continuous-time embedding of a stochastic zero-sum game, which is again shown to converge to the set of Nash equilibrium strategies. Finally, we present a modified δ-converging best-response dynamic, in which the discount rate converges to 1, and the learned value converges to the asymptotic value of the zero-sum stochastic game. The critical feature of all the dynamic processes is a separation of adaptation rates: beliefs about the value of states adapt more slowly than the strategies adapt, and in the case of the δ-converging dynamic the discount rate adapts more slowly than everything else.
Mixed-strategy learning with continuous action sets
Motivated by the recent applications of game-theoretical learning to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets. We propose an actor-critic reinforcement learning algorithm that adapts mixed strategies over continuous action spaces. To analyse the algorithm we extend the theory of finite-dimensional two-timescale stochastic approximation to a Banach space setting, and prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a provably-convergent learning algorithm in which players do not need to keep track of the controls selected by other agents.